---
title: "Temporal variation in transmission during the COVID-19 outbreak"
description: "To identify changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting."
status: in-progress
rmarkdown_html_fragment: true
redirect_from:
  - /topics/covid19/current-patterns-transmission/global-time-varying-transmission.html
update: 2020-03-05
authors:
  - id: sam_abbott
    corresponding: true
  - id: joel_hellewell
  - id: june_chun
  - id: nikos_bosse
  - id: yung_wai
  - id: tim_russell
  - id: james_munday
  - id: ncov-group
  - id: stefan_flasche
  - id: adam_kucharski
  - id: roz_eggo
  - id: seb_funk
---
Note: this is a preliminary analysis that has not yet been peer-reviewed. It is updated daily as new data becomes available.
Aim: To identify changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting.
Latest estimates as of the 2020-03-04
Figure 1: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04 in each region considered in the analysis. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control. Note: Data is only shown from the 18th of February onwards for Hubei due to changes in reporting.
| Region | Cases with date of onset on the day of report generation | Effective reproduction no. | Doubling time (days) |
|---|---|---|---|
| France | 5 – 53 | 1.7 – 2.6 | 3 – Decreasing |
| Germany | 19 – 86 | 1.1 – 1.9 | 1.3 – 13 |
| Hong Kong | 1 – 28 | 0.4 – 2.7 | 0.11 – Decreasing |
| Hubei | 46 – 204 | 0.5 – 0.6 | Decreasing – Decreasing |
| Italy | 406 – 676 | 1.8 – 2.2 | 3.6 – 10 |
| Japan | 1 – 53 | 0.7 – 1.4 | 3.3 – Decreasing |
| Singapore | 1 – 21 | 0.6 – 2.7 | 0.15 – Decreasing |
| South Korea | 394 – 693 | 1.1 – 1.3 | 26 – Decreasing |
| Spain | 15 – 74 | 1.4 – 2.6 | 1.8 – 13 |
Table 1: Latest estimates of the number of cases by date of onset, the effective reproduction number, and the doubling time for the 2020-03-04 in each region included in the analysis. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
We used partial line-lists from each region that contained the date of symptom onset, date of confirmation and import status (imported or local) for each case [3]. Line-list data was only available until the 18th of February for Singapore. A line-list was not available for Hubei. Daily case counts by date of report were extracted from the World Health Organization (WHO) situation reports for every location considered [1,2]. The case counts (and partial line-lists where available) were used to assemble the daily number of local and imported cases. Where the partial line-lists and case counts disagreed, we assumed that the partial line-lists were correct and adjusted the WHO case counts so that the overall number of cases remained the same, changing the number of local cases as needed.
Reporting delays for each country were estimated using the corresponding partial line-list of cases. The reporting delay could not be estimated from line-list data for Italy, France, and Spain. For these countries the reporting delay was estimated using a combined European line-list (including cases from Germany, France, Italy and Spain). A reporting delay for Hubei was estimated using an all-China line-list. The estimated reporting delay was assumed to remain constant over time in each location. We fitted an exponential distribution adjusted for censoring [7] to the observed delays using Stan [8]. We then took 1000 samples from the posterior distribution of the rate parameter for the exponential delay distribution and constructed a distribution of possible onset dates for each case based on their reporting date. To prevent spuriously long reporting delays, we re-sampled any delay that was greater than the maximum delay observed in the data.
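The onset-date sampling step can be illustrated with a short Python sketch. It is not the authors' R/Stan code: the function name `sample_onset_dates`, the stand-in rate values, the 20-day maximum delay, and the rounding of delays to whole days are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta


def sample_onset_dates(report_date, rate_samples, max_delay, rng):
    """Draw one candidate onset date per sampled rate parameter.

    Each delay is drawn from an exponential distribution and re-sampled
    if it exceeds the maximum observed delay (in days).
    """
    onsets = []
    for rate in rate_samples:
        delay = rng.expovariate(rate)
        while delay > max_delay:
            delay = rng.expovariate(rate)
        # Rounding to whole days is an assumption of this sketch.
        onsets.append(report_date - timedelta(days=round(delay)))
    return onsets


rng = random.Random(1)
# Hypothetical stand-ins for 1000 posterior draws of the rate (per day).
rate_samples = [rng.uniform(0.15, 0.25) for _ in range(1000)]
onsets = sample_onset_dates(date(2020, 3, 4), rate_samples, max_delay=20, rng=rng)
```

Repeating this for every reported case gives, for each posterior draw, one possible epidemic curve by date of onset.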
To account for censoring, i.e. cases that have not yet been confirmed but will show up in the data at a later time, we randomly sampled the true number of cases (including those not yet confirmed), assuming that the reported number of cases is drawn from a binomial distribution in which each case has an independent probability \(p_i\) of having been confirmed. Here \(i\) is the number of days between symptom onset and the date of the report (up to the maximum observed report delay), and \(p_i\) is the cumulative probability that a case is confirmed by day \(i\) after developing symptoms. We did not account for potential reporting biases that might occur due to changes in the growth rate of the outbreak over time.
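One way to sample a true case count under this binomial assumption is sketched below in Python. The text does not state how the sampling was implemented, so the choices here are assumptions: a flat prior on the true count (which makes the number of unconfirmed cases negative binomial), an exponential delay distribution for \(p_i\), and the hypothetical function names.

```python
import math
import random


def confirmed_probability(days_since_onset, rate):
    """Exponential CDF: probability of confirmation within the given days."""
    return 1.0 - math.exp(-rate * days_since_onset)


def sample_true_cases(observed, p, rng):
    """Sample a true count N given observed ~ Binomial(N, p).

    Assumes a flat prior on N, so N - observed is the number of failures
    before (observed + 1) successes, i.e. a negative binomial draw built
    here as a sum of geometric draws.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return observed
    missed = 0
    for _ in range(observed + 1):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        missed += int(math.log(u) / math.log(1.0 - p))
    return observed + missed


rng = random.Random(2)
p_3_days = confirmed_probability(3, rate=0.2)  # rate is a hypothetical value
draws = [sample_true_cases(50, p_3_days, rng) for _ in range(1000)]
```

Onset dates close to the report date have a small \(p_i\), so their sampled counts are both larger and more variable than the reported counts.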
We used the inferred number of cases to estimate the reproduction number on each day using the EpiEstim R package [4]. This uses a combination of the serial interval distribution and the number of observed cases to estimate the reproduction number at each time point [10,11]. The estimates were then smoothed using a 7-day time window. We assumed that the serial interval followed a Gamma distribution with a mean of 4.7 days and a standard deviation of 2.9 days [6]. Where data was available, we used EpiEstim to adjust for imported cases [5]. The probability of control was estimated using the proportion of samples with a reproduction number less than 1.
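The Python sketch below shows the kind of calculation described in [10], using the serial interval stated above. It is an illustration rather than EpiEstim itself. The Gamma prior on the reproduction number (shape 1, scale 5) and the crude discretisation of the serial interval (density evaluated at whole days and normalised) are assumptions of this sketch, and it ignores the adjustment for imported cases.

```python
import math
import random


def discretised_serial_interval(mean=4.7, sd=2.9, max_days=30):
    """Gamma density evaluated at days 1..max_days, normalised to sum to 1."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    dens = [d ** (shape - 1) * math.exp(-d / scale) for d in range(1, max_days + 1)]
    total = sum(dens)
    return [x / total for x in dens]  # w[k - 1] is the weight for k days


def infectiousness(cases, w):
    """Past cases weighted by the serial interval, for each day."""
    return [
        sum(cases[t - k] * w[k - 1] for k in range(1, min(t, len(w)) + 1))
        for t in range(len(cases))
    ]


def rt_posterior(cases, w, window=7, prior_shape=1.0, prior_scale=5.0):
    """Gamma posterior (shape, scale) of R for each window ending on day t."""
    lam = infectiousness(cases, w)
    out = {}
    for t in range(window, len(cases)):
        days = range(t - window + 1, t + 1)
        shape = prior_shape + sum(cases[i] for i in days)
        scale = 1.0 / (1.0 / prior_scale + sum(lam[i] for i in days))
        out[t] = (shape, scale)
    return out


def probability_of_control(shape, scale, rng, n=2000):
    """Proportion of posterior samples with a reproduction number below 1."""
    draws = [rng.gammavariate(shape, scale) for _ in range(n)]
    return sum(d < 1.0 for d in draws) / n


w = discretised_serial_interval()
cases = [int(10 * 1.1 ** t) for t in range(40)]  # hypothetical epidemic curve
shape, scale = rt_posterior(cases, w)[39]
posterior_mean_r = shape * scale
```

In the analysis itself this calculation would be repeated for each sampled epidemic curve, so the reported intervals reflect uncertainty in the onset dates as well as in the reproduction number.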
We estimated the rate of spread (\(r\)) using linear regression with time as the only exposure and log-transformed cases as the outcome for the overall course of the outbreak [12]. The adjusted \(R^2\) value was then used to assess the goodness of fit. To account for potential changes in the rate of spread over the course of the outbreak, we used a 7-day sliding window to produce time-varying estimates of the rate of spread and the adjusted \(R^2\). The doubling time was then estimated as \(\frac{\ln(2)}{r}\) for each estimate of the rate of spread.
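A minimal Python sketch of this regression and the doubling-time calculation follows. The function names are hypothetical, and the sketch assumes strictly positive daily counts, since the text does not say how days with zero cases were handled. A non-positive rate of spread is mapped to an infinite doubling time, as in the figure captions.

```python
import math


def fit_growth_rate(cases):
    """Least-squares fit of log(cases) on day index.

    Returns (rate of spread r, adjusted R-squared). Needs at least three
    strictly positive counts.
    """
    y = [math.log(c) for c in cases]
    n = len(y)
    x = list(range(n))
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = sxy / sxx
    intercept = y_bar - r * x_bar
    ss_res = sum((yi - (intercept + r * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)
    return r, adjusted


def doubling_time(r):
    """ln(2) / r, taken as infinite when the rate of spread is not positive."""
    return math.log(2) / r if r > 0 else math.inf


def sliding_estimates(cases, window=7):
    """Time-varying (r, adjusted R-squared) over a sliding window."""
    return [fit_growth_rate(cases[i:i + window]) for i in range(len(cases) - window + 1)]


cases = [10 * 2 ** (t / 2) for t in range(14)]  # hypothetical: doubles every 2 days
r, adj_r2 = fit_growth_rate(cases)
```

For this hypothetical series `doubling_time(r)` recovers a doubling time of 2 days, and `sliding_estimates(cases)` returns one estimate per 7-day window.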
We report 95% credible intervals for all measures using the 2.5% and 97.5% quantiles. The analysis was conducted independently for all regions and is updated daily as new data becomes available. Confidence in our estimates is shown using the proportion of data that were derived using binomial upscaling.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 5 – 53 |
| Effective reproduction no. | 1.7 – 2.6 |
| Rate of spread | -0.28 – 0.23 |
| Doubling time (days) | 3 – Decreasing |
| Adjusted R-squared | -0.28 – 0.64 |
Table 2: Latest estimates for France of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 2: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 3: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 4: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 19 – 86 |
| Effective reproduction no. | 1.1 – 1.9 |
| Rate of spread | 0.053 – 0.54 |
| Doubling time (days) | 1.3 – 13 |
| Adjusted R-squared | 0.1 – 0.91 |
Table 3: Latest estimates for Germany of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 5: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 6: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 7: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 1 – 28 |
| Effective reproduction no. | 0.4 – 2.7 |
| Rate of spread | -5.3 – 6.4 |
| Doubling time (days) | 0.11 – Decreasing |
| Adjusted R-squared | -0.33 – 0.54 |
Table 4: Latest estimates for Hong Kong of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 8: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 9: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 10: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 46 – 204 |
| Effective reproduction no. | 0.5 – 0.6 |
| Rate of spread | -0.32 – -0.085 |
| Doubling time (days) | Decreasing – Decreasing |
| Adjusted R-squared | 0.47 – 0.89 |
Table 5: Latest estimates for Hubei of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 11: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 12: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 13: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 406 – 676 |
| Effective reproduction no. | 1.8 – 2.2 |
| Rate of spread | 0.066 – 0.19 |
| Doubling time (days) | 3.6 – 10 |
| Adjusted R-squared | 0.58 – 0.93 |
Table 6: Latest estimates for Italy of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 14: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 15: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 16: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 1 – 53 |
| Effective reproduction no. | 0.7 – 1.4 |
| Rate of spread | -0.33 – 0.21 |
| Doubling time (days) | 3.3 – Decreasing |
| Adjusted R-squared | -0.17 – 0.61 |
Table 7: Latest estimates for Japan of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 17: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 18: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 19: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 1 – 21 |
| Effective reproduction no. | 0.6 – 2.7 |
| Rate of spread | -4.1 – 4.6 |
| Doubling time (days) | 0.15 – Decreasing |
| Adjusted R-squared | -0.25 – 0.61 |
Table 8: Latest estimates for Singapore of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 20: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 21: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 22: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 394 – 693 |
| Effective reproduction no. | 1.1 – 1.3 |
| Rate of spread | -0.054 – 0.027 |
| Doubling time (days) | 26 – Decreasing |
| Adjusted R-squared | -0.17 – 0.45 |
Table 9: Latest estimates for South Korea of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 23: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 24: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 25: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
| Estimate | 95% credible interval |
|---|---|
| Cases with date of onset on the day of report generation | 15 – 74 |
| Effective reproduction no. | 1.4 – 2.6 |
| Rate of spread | 0.051 – 0.39 |
| Doubling time (days) | 1.8 – 13 |
| Adjusted R-squared | 0.18 – 0.94 |
Table 10: Latest estimates for Spain of the number of cases by date of onset, the effective reproduction number, the rate of spread, the doubling time, and the adjusted R-squared of the exponential fit for the 2020-03-04. Based on the last 7 days of data. The 95% credible interval is shown for each estimate.
Figure 26: Cases by date of report (bars) and estimated cases by date of onset (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
Figure 27: Time-varying estimate of the effective reproduction no. (light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range) based on data from the 2020-03-04. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence. The dotted line indicates the target value of 1 for the effective reproduction no. required for control.
Figure 28: A.) Time-varying estimate of the rate of spread, B.) Time-varying estimate of the doubling time in days (note that when the rate of spread is negative the doubling time is assumed to be infinite), C.) The adjusted R-squared estimates indicating the goodness of fit of the exponential regression model (with values closer to 1 indicating a better fit). Based on data from the 2020-03-04. Light grey ribbon = 95% credible interval; dark grey ribbon = the interquartile range. Confidence in the estimated values is indicated by shading with reduced shading corresponding to reduced confidence.
1 World Health Organization. Coronavirus disease (COVID-2019) situation reports. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports
2 Brown E. Data2019nCoV: Data on the covid-19 outbreak. 2020.
3 Xu B, Gutierrez B, Hill S et al. Epidemiological Data from the nCoV-2019 Outbreak: Early Descriptions from Publicly Available Data. 2020.
4 Cori A. EpiEstim: Estimate time varying reproduction numbers from epidemic curves. 2019. https://CRAN.R-project.org/package=EpiEstim
5 Thompson R, Stockwin J, Gaalen R van et al. Improved inference of time-varying reproduction numbers during infectious disease outbreaks. Epidemics 2019;29:100356. doi:10.1016/j.epidem.2019.100356
6 Nishiura H, Linton NM, Akhmetzhanov AR. Serial interval of novel coronavirus (2019-nCoV) infections. medRxiv Published Online First: 2020. doi:10.1101/2020.02.03.20019497
7 Thompson RN. 2019-20 Wuhan coronavirus outbreak: Intense surveillance is vital for preventing sustained transmission in new locations. bioRxiv 2020;1–14.
8 Stan Development Team. RStan: The R interface to Stan. 2020. http://mc-stan.org/
9 R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing 2019. https://www.R-project.org/
10 Cori A, Ferguson NM, Fraser C et al. A New Framework and Software to Estimate Time-Varying Reproduction Numbers During Epidemics. American Journal of Epidemiology 2013;178:1505–12. doi:10.1093/aje/kwt133
11 Wallinga J, Teunis P. Different Epidemic Curves for Severe Acute Respiratory Syndrome Reveal Similar Impacts of Control Measures. American Journal of Epidemiology 2004;160:509–16. doi:10.1093/aje/kwh255
12 Park SW, Champredon D, Weitz JS et al. A practical generation-interval-based approach to inferring the strength of epidemics from their speed. Epidemics 2019;27:12–8. doi:10.1016/j.epidem.2018.12.002